Probit Normal Correlated Topic Models

نویسندگان

  • Xingchen Yu
  • Ernest Fokoué
چکیده

The logistic normal distribution has recently been adapted via the transformation of multivariate Gaussian variables to model the topical distribution of documents in the presence of correlations among topics. In this paper, we propose a probit normal alternative approach to modelling correlated topical structures. Our use of the probit model in the context of topic discovery is novel, as many authors have so far concentrated solely of the logistic model partly due to the formidable inefficiency of the multinomial probit model even in the case of very small topical spaces. We herein circumvent the inefficiency of multinomial probit estimation by using an adaptation of the diagonal orthant multinomial probit in the topic models context, resulting in the ability of our topic modelling scheme to handle corpuses with a large number of latent topics. An additional and very important benefit of our method lies in the fact that unlike with the logistic normal model whose non-conjugacy leads to the need for sophisticated sampling schemes, our approach exploits the natural conjugacy inherent in the auxiliary formulation of the probit model to achieve greater simplicity. The application of our proposed scheme to a well known Associated Press corpus not only helps discover a large number of meaningful topics but also reveals the capturing of compellingly intuitive correlations among certain topics. Besides, our proposed approach lends itself to even further scalability thanks to various existing high performance algorithms and architectures capable of handling millions of documents.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modelling of Correlated Ordinal Responses, by Using Multivariate Skew Probit with Different Types of Variance Covariance Structures

In this paper, a multivariate fundamental skew probit (MFSP) model is used to model correlated ordinal responses which are constructed from the multivariate fundamental skew normal (MFSN) distribution originate to the greater flexibility of MFSN. To achieve an appropriate VC structure for reaching reliable statistical inferences, many types of variance covariance (VC) structures are considered ...

متن کامل

The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data

The goal of this study is to introduce a statistical method regarding the analysis of specific latent data for regression analysis of the discrete data and to build a relation between a probit regression model (related to the discrete response) and normal linear regression model (related to the latent data of continuous response). This method provides precise inferences on binary and multinomia...

متن کامل

Parameterization of multivariate random effects models for categorical data.

Alternative parameterizations and problems of identification and estimation of multivariate random effects models for categorical responses are investigated. The issues are illustrated in the context of the multivariate binomial logit-normal (BLN) model introduced by Coull and Agresti (2000, Biometrics 56, 73-80). We demonstrate that the BLN model is poorly identified unless proper restrictions...

متن کامل

Mixture of Normals Probit Models

This paper generalizes the normal probit model of dichotomous choice by introducing mixtures of normals distributions for the disturbance term. By mixing on both the mean and variance parameters and by increasing the number of distributions in the mixture these models effectively remove the normality assumption and are much closer to semiparametric models. When a Bayesian approach is taken, the...

متن کامل

Probit-Based Traffic Assignment: A Comparative Study between Link-Based Simulation Algorithm and Path-Based Assignment and Generalization to Random-Coefficient Approach

Probabilistic approach of traffic assignment has been primarily developed to provide a more realistic and flexible theoretical framework to represent traveler’s route choice behavior in a transportation network. The problem of path overlapping in network modelling has been one of the main issues to be tackled. Due to its flexible covariance structure, probit model can adequately address the pro...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1410.0908  شماره 

صفحات  -

تاریخ انتشار 2014